The goal here is to create a prediction model for the lead level over the Meuse River region
This prediction model will be graphically portrayed with a:
ggplot2 over an actual mapggplot2 over an actual mapplotly surface plotThe Log transformation is used to make the data more normally distributed, it is required because it creates predicted values that more closely resemble the the observed values. ##Lead-Levels It can be seen that there are pockets of high lead levels on the shore of the river at three points.
The following code will automatically install packages that are not already installed and load them into the library:
if(require('pacman')){
library('pacman')
}else{
install.packages('pacman')
library('pacman')
}
pacman::p_load(sp, EnvStats, gstat, ggplot2, rmarkdown, reshape2, ggmap,
RColorBrewer, parallel, dplyr, plotly, proj4, rgdal)
Installing package into 㤼㸱C:/Users/qb3jl/Documents/R/win-library/3.4㤼㸲
(as 㤼㸱lib㤼㸲 is unspecified)
cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/src/contrib/PACKAGES.rds': HTTP status was '404 Not Found'cannot open URL 'http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.4/PACKAGES.rds': HTTP status was '404 Not Found'trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/proj4_1.0-8.zip'
Content type 'application/zip' length 311182 bytes (303 KB)
package ‘proj4’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\qb3jl\AppData\Local\Temp\RtmpkPn2H2\downloaded_packages
proj4 installed
Note that meuse data set is included in the sp package and is not a data frame but an sp class, which is very similar to a data frame
#Inspect the Data Set
data(meuse)
head(meuse)
#Assignments
lon <- meuse$x
lat <- meuse$y
cd <- meuse$cadmium
cu <- meuse$copper
pb <- meuse$lead
zn <- meuse$zinc
The meuse Data set is an sp class objects and requires for its co-ordinates to be set:
rm(meuse); data(meuse)
coordinates(meuse) = ~x + y
The co-ordinate reference system is in meters north and meters east, this is going to be incompatible with the ggmap function, so it will have to be transformed first.
The transformation can be acheived by changing the proj4string. This is provided in the ?meuse help page:
proj4string(meuse) <- CRS("+init=epsg:28992")
meuse.ll <- spTransform(meuse, CRS("+init=epsg:4326"))
lon <- meuse.ll@coords[,1]
lat <- meuse.ll@coords[,2]
In order to create a krige prediction model two things are required:
Using an ordinary kriging method without any surface model provided:
lead.vg <- variogram(object = lead~1, data = meuse.ll)
plot(lead.vg)
head(lead.vg)
np dist gamma dir.hor dir.ver id
1 57 0.07929104 3881.474 0 0 var1
2 299 0.16397078 7317.871 0 0 var1
3 419 0.26736014 7953.443 0 0 var1
4 457 0.37272890 10144.515 0 0 var1
5 548 0.47856656 11379.696 0 0 var1
6 533 0.58553009 13875.099 0 0 var1
#This is what would be used for the northing/easting CRS
lead.vg.fit <- fit.variogram(lead.vg, model = vgm(1, "Sph", 900, 1))
No convergence after 200 iterations: try different initial values?
#For the new one, this method was suggested on stackexchange:
# .../226048/r-gstat-warning-singular-model-in-variogram-fit
lead.vg.fit <- fit.variogram(lead.vg,
model = vgm(psill=max(lead.vg$gamma)*0.9,
model = "Sph",
range=max(lead.vg$dist)/2,
nugget = mean(lead.vg$gamma)/4))
Thus the variogram model needed for the krige prediction has been ascertained
This will essentially act as the resolution of the overlay, Most monitors are about 140 ppi, hence assuming the plot will be a 9 inch square, about 800 x 800 pixels should look right, but 200 x 200 will be the used due to performance.
xgrid <- seq(min(lon), max(lon), length.out = 200)
ygrid <- seq(min(lat), max(lat), length.out = 200)
This domain base however has to be a spatial object, this can be acheived by defining co-ordinates in the data frame
xy.surface <- expand.grid(lon = xgrid, lat = ygrid)
coordinates(xy.surface) = ~lon + lat
proj4string(xy.surface) <- proj4string(meuse.ll)
head(xy.surface, 3)
SpatialPoints:
lon lat
1 5.723190 50.95661
2 5.723390 50.95661
3 5.723591 50.95661
Coordinate Reference System (CRS) arguments: +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs
+ellps=WGS84 +towgs84=0,0,0
Krige predictions can be more accurate if the data is normally distributed.
Hence the case study used a log transformation on the Zn value, this is because the Zn value was non-normally distributed and the log transformation somewhat fixed that.
par(mfcol = (c(1,2)))
hist(zn, main = "Zinc Value")
hist(log(zn), main = "Log-Transformation of Zinc Value")
An inspection of the Pb values shows that the log-transformation may be somewhat more normal, hence it will be used in the analysis.
par(mfcol = (c(1,2)))
hist(pb, main = "Lead Value")
hist(log(pb), main = "Log-Transformation of lead Value")
lead.pred <- krige(formula = lead ~1,
locations = meuse.ll,
newdata = xy.surface,
model= lead.vg.fit.exp)
[using ordinary kriging]
lead.log.pred = krige(log(lead)~1, meuse.ll, xy.surface, model = lead.vg.fit)
[using ordinary kriging]
lead.pred = krige(lead~1, meuse.ll, xy.surface, model = lead.vg.fit)
[using ordinary kriging]
par(mfcol = c(2,2))
hist(pb, main = "Lead Values",
xlab = "Lead (ppm)")
hist(lead.pred$var1.pred, main = "Prediction using Lead Values",
xlab = "Lead (ppm)")
hist(pb, main = "Lead Values",
xlab = "Lead (ppm)")
hist(exp(lead.log.pred$var1.pred),
main = "Prediction using Log Transformation",
xlab = "Lead (ppm)")
It can be seen that the prediction using the log transformation has a shape that (ever so slightly) more closely resembles the distribution of lead values, hence it is used:
lead.log.pred$var1.pred <- exp(lead.log.pred$var1.pred)
head(lead.log.pred)
coordinates var1.pred var1.var
1 (5.72319, 50.95661) 188.1950 11103.12
2 (5.72339, 50.95661) 190.2934 10833.76
3 (5.723591, 50.95661) 192.4756 10560.74
4 (5.723791, 50.95661) 194.7312 10285.57
5 (5.723991, 50.95661) 197.0453 10010.13
6 (5.724191, 50.95661) 199.3800 9736.67
Now there is a prediction model, if that prediction model is extrapolated over our domain (xy.surface) the various plots can be created
In order to create a heatmap, it is first necessary to take the sp prediction object and make it into pixel data, otherwise the heatmap won’t look quite right:
class(lead.pred)
[1] "SpatialPointsDataFrame"
attr(,"package")
[1] "sp"
gridded(lead.pred) = TRUE
class(lead.pred)
[1] "SpatialPixelsDataFrame"
attr(,"package")
[1] "sp"
spplot(lead.pred["var1.pred"],main="Kriging Predictions of lead levels (ppm) in Meuse River")
First it is necessary to create a data frame from the sp object:
lead.pred.df <- cbind(lead.pred@coords, lead.pred@data)
lead.pred.df.tidy <- melt(lead.pred.df)
No id variables; using all as measure variables
head(lead.pred.df)
ggplot() +
geom_tile(data = lead.pred.df, aes(lon, lat, fill = var1.pred)) +
labs(fill = "Lead (ppm)") #
This plot would be easier to read if it had discrete intervals rather than a continuous scale:
lead.pred.df$bin <- cut(lead.pred.df$var1.pred,
breaks = 15)
ggplot() +
geom_tile(data = lead.pred.df, aes(lon, lat, fill = bin)) +
labs(fill = "Lead Levels (ppm)",
title = "Heatmap of Lead levels (ppm)",
x = "Longitude",
y = "Latitude")
NA
Now by adjusting the colour scale and opacity, this can be used as an overlay for the map from exercise 10.1:
lead.pred.df$bin <- cut(lead.pred.df$var1.pred,
breaks = 9)
#Get a map
bbox <- make_bbox(lead.pred.df$lon, lead.pred.df$lat, f = 0.01)
map <- get_map(location = bbox, maptype = 'watercolor')
maptype = "watercolor" is only available with source = "stamen".
resetting to source = "stamen"...
Map from URL : http://tile.stamen.com/watercolor/15/16904/10971.jpg
cannot remove file '674296abddf1275cb35a8d99251df2a2.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10971.jpg
cannot remove file 'c2726ef941eca027ed57a5c995986c7f.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10971.jpg
cannot remove file '20aa44a7b61408bd1860b6d5fca6002c.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10971.jpg
cannot remove file '68da04da1540cf4675545c4a224f8b80.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10971.jpg
cannot remove file '48a5d8b6a107cc29242dc2df62144e10.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16904/10972.jpg
cannot remove file 'ef08b15404eebd3995a7d9267621f6ca.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10972.jpg
cannot remove file '02a38509cb1fc57d1d9b4e44cdcf7663.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10972.jpg
cannot remove file '5624e5f2c76636afe0d374eaeb029bf8.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10972.jpg
cannot remove file '2fb11224ce5f00bab3147dbe0c89f114.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10972.jpg
cannot remove file '361a854f37b3dcd33988ca775970d4aa.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16904/10973.jpg
cannot remove file '3d16c190e71e33063578628e4ca139d9.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10973.jpg
cannot remove file '82f35acb5ce2c665f97a989a4807b703.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10973.jpg
cannot remove file '7a8a0fe5be2307e7fe836b4ef35d0ac1.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10973.jpg
cannot remove file '94c0c9dc4ae21b37ce2b81c46f4a7be3.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10973.jpg
cannot remove file 'b48f06e8a773d7b11eddb9bd2c3b44d4.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16904/10974.jpg
cannot remove file '97efd5f84be1236b56af56fce821ed8e.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10974.jpg
cannot remove file '624eb966399945020a1f156db9584b38.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10974.jpg
cannot remove file '9a116aa432a85b6568582d630d54c3d6.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10974.jpg
cannot remove file '07da421166a46dad21df07b3eb58040c.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10974.jpg
cannot remove file '6b7edb962f7518a80e65fbf6d033a3b4.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16904/10975.jpg
cannot remove file 'b40a134db57039ced6ebc623c8174b07.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10975.jpg
cannot remove file 'cb77ab9ed11cb603f8393d2974ce9ef2.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10975.jpg
cannot remove file 'b08c5e3bb62c74e692106c0caccff1c6.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10975.jpg
cannot remove file 'd77af6dc50eae922f7cc4fd3071f4112.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10975.jpg
cannot remove file '9511d153129611e0f8d12678b9c224c2.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16904/10976.jpg
cannot remove file '608164d74d92063289fcd117b14d1876.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10976.jpg
cannot remove file '769c3f466dcfbeb2f6c696c64c071d87.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16906/10976.jpg
cannot remove file 'c96537180437eaa3c2eb81d764d72d5b.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16907/10976.jpg
cannot remove file 'becdfb8a4072a93592b53c910ad19755.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16908/10976.jpg
cannot remove file '4a7e18a78545a22911eb4be2ee531c14.rds', reason 'No such file or directory'
#Create the colours
cols <- brewer.pal(n = length(levels(lead.pred.df$bin)), name = "Oranges")
#Create the Plot
ggmap(ggmap = map) +
geom_tile(data = lead.pred.df, aes(lon, lat, fill = bin, alpha = var1.pred))+
scale_fill_manual(values = cols) +
labs(fill = "Lead Index",
x = "Longitude",
y = "Lattitude",
title = "Heatmap of Lead Index") +
guides(alpha=FALSE)
NA
In order to view the surface plot, contours can also be used.
Recall from Week 11/Exercise 10, that contour and surface plots work a little differently to ordinary plots.
A domain grid of equally spaced \(x\) and \(y\) values is required and a matrix of \(z\) values that can be overlayed onto that domain grid in order to provide the surface values required to create the surface plot.
The method from Lecture 10/Wk. 11 involved predicting over the square domain of xy.surface, this won’t work here because instead of the predict function, the krige function was used, hence it is necessary to convert the output from the krige function, into the output that would have been given by the predict function, which is the required \(z\) surface matrix.
This can be acheived with the data.matrix function:
z_surface_matrix <- data.matrix(lead.pred[1])
Using base R packages, a contour plot can be created with the contour function:
contour(xgrid, ygrid, z_surface_matrix,
xlab="-Longitude (degrees West)", ylab="Latitude (degrees North)",
labcex = 0.8)
mtext("Krige using a 4th degree polynomial model", cex = 0.5)
In order to create a contour plot in ggplot 2, it will be necessary to use ‘tidy’ data, which is a table of data such that each column corresponds to a variable:
rownames(z_surface_matrix) <- xgrid
colnames(z_surface_matrix) <- ygrid
lead.pred.melt <- melt(z_surface_matrix,
varnames = c("Longitude", "Latitude"),
value.name = "Lead_Level")
head(lead.pred.melt)
Observe that this is another way of getting the data frame that was required in part 1 (lead.pred.df), it is included here as another method and for the sake of completion, also it matches the method used in week 10/lecture 11.
Now using ggplot2 a contour plot, can be created:
ggplot(data = lead.pred.melt, aes(x = Longitude,
y = Latitude,
z = Lead_Level,
labs(x = "Longitude", y = 'Latitude',
title = 'Lead Levels (ppm)',
col = 'Lead Levels (ppm)', size = 'Lead Levels (ppm)') +
geom_contour(lwd = 0.1) +
theme_classic()
The advantage to using ggplot2, is the map overlay can be taken advantage of:
map <- get_map(location = bbox, maptype = 'watercolor')
maptype = "watercolor" is only available with source = "stamen".
resetting to source = "stamen"...
Map from URL : http://tile.stamen.com/watercolor/15/16904/10971.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16905/10971.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10971.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10971.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10971.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16904/10972.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16905/10972.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10972.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10972.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10972.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16904/10973.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16905/10973.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10973.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10973.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10973.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16904/10974.jpg
cannot remove file '97efd5f84be1236b56af56fce821ed8e.rds', reason 'No such file or directory'Map from URL : http://tile.stamen.com/watercolor/15/16905/10974.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10974.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10974.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10974.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16904/10975.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16905/10975.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10975.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10975.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10975.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16904/10976.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16905/10976.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16906/10976.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16907/10976.jpg
Map from URL : http://tile.stamen.com/watercolor/15/16908/10976.jpg
ggmap(ggmap = map) +
labs(fill = "Lead Levels (ppm)",
x = "Longitude",
y = "Lattitude",
title = "Heatmap of Lead Levels (ppm)") +
guides(alpha=FALSE) +
geom_contour(data = lead.pred.melt, aes(x = Longitude,
y = Latitude,
z = Lead_Level),
col = 'grey40',
binwidth = 18,
size = 0.10) +
scale_fill_manual(values = cols)
This can be combined with the heatmap layer:
ggmap(ggmap = map) +
labs(fill = "Lead Levels (ppm",
x = "Longitude",
subtitle = "Meuse River") +
guides(alpha=FALSE) +
geom_tile(data = lead.pred.df, aes(lon, lat, fill = bin, alpha = var1.pred)) +
geom_contour(data = lead.pred.melt, aes(x = Longitude,
col = 'indianred',
binwidth = 19,
A surface plot allows a three dimensional view of the model
In base package, a surface plot can be made:
persp(xgrid, ygrid, z_surface_matrix,
theta = -45, phi = 30, d = 0.5,
xlab="-Longitude (degrees West)",
ylab="Latitude (degrees North)",
zlab="Lead Levels (ppm)", ticktype = "detailed")
title(main=paste("Surface Plot of Lead Levels (ppm)", "Loess Smoothing", sep="\n"))
Plotly has the advantage of being fully interactive and not as difficult to constrain within a plot/box area:
p <- plot_ly(x = ygrid, y = xgrid, z = z_surface_matrix) %>%
add_surface() %>%
layout(
title = "Lead Levels over Meuse River",
scene = list(
xaxis = list(title = "Longitude"),
yaxis = list(title = "Latitude"), #The domain here, do it above
zaxis = list(title = "Lead Levels (ppm)")
))
The surface plot re-inforces the fact that the lead levels are seemingly lowest within the river with spikes at certain shore locations.